AITopics | action profile

action profile

Solving Graph-based Public Good Games with Tree Search and Imitation Learning

Neural Information Processing Systems | Apr-24-2026, 16:32:06 GMT

Public goods games represent insightful settings for studying incentives for individual agents to make contributions that, while costly for each of them, benefit the wider society. In this work, we adopt the perspective of a central planner with a global view of a network of self-interested agents and the goal of maximizing some desired property in the context of a best-shot public goods game. Existing algorithms for this known NP-complete problem find solutions that are sub-optimal and cannot optimize for criteria other than social welfare. In order to efficiently solve public goods games, our proposed method directly exploits the correspondence between equilibria and the Maximal Independent Set (mIS) structural property of graphs. In particular, we define a Markov Decision Process which incrementally generates an mIS, and adopt a planning method to search for equilibria, outperforming existing methods. Furthermore, we devise a graph imitation learning technique that uses demonstrations of the search to obtain a graph neural network parametrized policy which quickly generalizes to unseen game instances. Our evaluation results show that this policy is able to reach 99.5% of the performance of the planning method while being three orders of magnitude faster to evaluate on the largest graphs tested. The methods presented in this work can be applied to a large class of public goods games of potentially high societal impact and more broadly to other graph combinatorial optimization problems.
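The correspondence mentioned in the abstract can be illustrated with a small sketch. In a best-shot public goods game on a graph, a contribution pattern is a pure equilibrium exactly when the contributors form a maximal independent set: no two contributors are neighbors, and every non-contributor has a contributing neighbor. The function names, the adjacency-dictionary representation, and the fixed node ordering below are illustrative assumptions; this is not the paper's tree search or its learned policy, only the structural check and a simple incremental construction.

```python
from typing import Dict, Hashable, Iterable, Set

# Undirected graph without self-loops: node -> set of neighbours (assumed representation).
Graph = Dict[Hashable, Set[Hashable]]


def is_maximal_independent_set(graph: Graph, chosen: Set[Hashable]) -> bool:
    """Return True if `chosen` is independent and cannot be extended."""
    # Independence: no chosen node has a chosen neighbour.
    for node in chosen:
        if graph[node] & chosen:
            return False
    # Maximality: every unchosen node has at least one chosen neighbour.
    return all(graph[node] & chosen for node in graph if node not in chosen)


def build_mis(graph: Graph, order: Iterable[Hashable]) -> Set[Hashable]:
    """Grow an independent set one node at a time, following `order`.

    If `order` visits every node, the result is a maximal independent set.
    Different orders lead to different sets (different equilibria).
    """
    chosen: Set[Hashable] = set()
    for node in order:
        if not (graph[node] & chosen):  # node is still addable
            chosen.add(node)
    return chosen


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
contributors = build_mis(path, [0, 1, 2, 3])
```

The order in which nodes are considered determines which equilibrium is reached, which is why a planner that wants to optimize a particular objective has to search over these choices rather than fix one order.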

artificial intelligence, equilibria, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

056e8e9c8ca9929cb6cf198952bf1dbb-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 09:17:17 GMT

action profile, agent, artificial intelligence, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

Queue Up Your Regrets: Achieving the Dynamic Capacity Region of Multiplayer Bandits

Neural Information Processing Systems | Apr-24-2026, 09:17:13 GMT

Consider N cooperative agents such that for T turns, each agent n takes an action a_n and receives a stochastic reward r_n(a_1, ..., a_N). Agents cannot observe the actions of other agents and do not even know their own reward function. The agents can communicate with their neighbors on a connected graph G with diameter d(G). We want each agent n to achieve an expected average reward of at least λ_n over time, for a given quality of service (QoS) vector λ. A QoS vector λ is not necessarily achievable.

agent, algorithm, artificial intelligence, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

db2d2001f63e83214b08948b459f69f0-Paper-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 07:08:39 GMT

algorithm, convergence rate, potential function, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Game Theory (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Your Title

Your Name

Neural Information Processing Systems | Feb-7-2026, 14:23:14 GMT

Consider a team of cooperative players that take actions in a networked environment. At each turn, each player chooses an action and receives a reward that is an unknown function of all the players' actions. The goal of the team of players is to learn to play together the action profile that maximizes the sum of their rewards. However, players cannot observe the actions or rewards of other players, and can only get this information by communicating with their neighbors.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

056e8e9c8ca9929cb6cf198952bf1dbb-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 07:17:39 GMT

This search does not affect the computational complexity, which is O(νnDE + SE) for agent n that computes DE parallel consensus steps and goes over a list of SE action profiles. Intuitively, we would need E KN to find the optimal action profile even with no noise, which creates delays where agents have to wait for their average reward to go above their λn. In the multitasking robots game, if agent n has Ren = 0, then the optimal action profile a e has to satisfy a e,m = n for all m. If λ is a safe margin away from the boundary of C(G), then most agents will have Ren = 0 most of the time. Hence, their performance depends on the best action profile in SE.

action profile, artificial intelligence, ren, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.66)

Queue Up Your Regrets: Achieving the Dynamic Capacity Region of Multiplayer Bandits

Neural Information Processing Systems | Feb-7-2026, 07:17:36 GMT

Our main observation is that the gap between λ_n t and the accumulated reward of agent n, which we call the QoS regret, behaves like a queue.
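The queue analogy can be made concrete with a short sketch. Reading the sentence above literally, the gap after t turns is the target rate times t minus the rewards collected so far, so each turn adds the target rate (an arrival) and subtracts the realized reward (a service). The function name and the toy reward sequence are assumptions for illustration, and the sketch tracks the raw gap without clipping it at zero; the paper's exact definition may differ.

```python
from typing import List, Sequence


def qos_regret_path(target_rate: float, rewards: Sequence[float]) -> List[float]:
    """Return the gap target_rate * t - (sum of the first t rewards) for t = 1..T."""
    gaps: List[float] = []
    backlog = 0.0
    for reward in rewards:
        # Arrival of `target_rate` units of demand, service of `reward` units.
        backlog += target_rate - reward
        gaps.append(backlog)
    return gaps


# Hypothetical agent with target 0.5 that earns nothing for two turns, then 1 per turn.
gaps = qos_regret_path(0.5, [0.0, 0.0, 1.0, 1.0, 1.0])
```

In this toy run the backlog grows while the agent earns less than its target and drains once rewards exceed it, which is the behavior a queue with arrival rate equal to the target would show.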

agent, algorithm, artificial intelligence, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Cooperative Multi-player Bandit Optimization

Neural Information Processing Systems | Dec-23-2025, 19:08:31 GMT

Consider a team of cooperative players that take actions in a networked-environment. At each turn, each player chooses an action and receives a reward that is an unknown function of all the players' actions. The goal of the team of players is to learn to play together the action profile that maximizes the sum of their rewards. However, players cannot observe the actions or rewards of other players, and can only get this information by communicating with their neighbors. We design a distributed learning algorithm that overcomes the informational bias players have towards maximizing the rewards of nearby players they got more information about. We assume twice continuously differentiable reward functions and constrained convex and compact action sets. Our communication graph is a random time-varying graph that follows an ergodic Markov chain. We prove that even if at every turn players take actions based only on the small random subset of the players' rewards that they know, our algorithm converges with probability 1 to the set of stationary points of (projected) gradient ascent on the sum of rewards function. Hence, if the sum of rewards is concave, then the algorithm converges with probability 1 to the optimal action profile.
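The limiting behavior described in the abstract refers to projected gradient ascent on the sum of rewards. As a reference point, here is a minimal centralized sketch of that procedure on a box-shaped action set. It is not the distributed algorithm of the paper: it assumes the full gradient of the sum of rewards is available, whereas the paper's players only see a random subset of rewards. The toy objective, step size, and function names are all illustrative assumptions.

```python
from typing import Callable, List, Sequence


def project_onto_box(point: Sequence[float], lower: float, upper: float) -> List[float]:
    """Euclidean projection onto the box [lower, upper]^d (a convex, compact set)."""
    return [min(max(x, lower), upper) for x in point]


def projected_gradient_ascent(
    gradient: Callable[[Sequence[float]], List[float]],
    start: Sequence[float],
    lower: float,
    upper: float,
    step: float = 0.1,
    iterations: int = 200,
) -> List[float]:
    """Repeat: take a gradient step uphill, then project back onto the action set."""
    point = list(start)
    for _ in range(iterations):
        grad = gradient(point)
        moved = [x + step * g for x, g in zip(point, grad)]
        point = project_onto_box(moved, lower, upper)
    return point


def toy_gradient(a: Sequence[float]) -> List[float]:
    # Gradient of the concave toy sum of rewards -(a1 - 2)^2 - (a2 + 1)^2.
    return [-2.0 * (a[0] - 2.0), -2.0 * (a[1] + 1.0)]


best_profile = projected_gradient_ascent(toy_gradient, [0.5, 0.5], 0.0, 1.0)
```

Because the toy objective is concave and its unconstrained maximizer (2, -1) lies outside the box, the iterates settle at the box corner (1, 0), the constrained optimal action profile.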

cooperative multi-player bandit optimization, electronic proceedings, name change, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.77)

Bayesian Optimization for Non-Cooperative Game-Based Radio Resource Management

Zhang, Yunchuan, Chen, Jiechen, Liu, Junshuo, Qiu, Robert C.

arXiv.org Artificial Intelligence | Dec-2-2025

Radio resource management in modern cellular networks often calls for the optimization of complex utility functions that are potentially conflicting between different base stations (BSs). Coordinating the resource allocation strategies efficiently across BSs to ensure stable network service poses significant challenges, especially when each utility is accessible only via costly, black-box evaluations. This paper considers formulating the resource allocation among spectrum sharing BSs as a non-cooperative game, with the goal of aligning their allocation incentives toward a stable outcome. To address this challenge, we propose PPR-UCB, a novel Bayesian optimization (BO) strategy that learns from sequential decision-evaluation pairs to approximate pure Nash equilibrium (PNE) solutions. PPR-UCB applies martingale techniques to Gaussian process (GP) surrogates and constructs high probability confidence bounds for utilities uncertainty quantification. Experiments on downlink transmission power allocation in a multi-cell multi-antenna system demonstrate the efficiency of PPR-UCB in identifying effective equilibrium solutions within a few data samples.

artificial intelligence, machine learning, utility function, (19 more...)

arXiv.org Artificial Intelligence

2512.01245

Country:

Europe (0.46)
Asia > China (0.29)

Genre: Research Report (1.00)

Industry:

Telecommunications (0.87)
Health & Medicine (0.84)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Communications > Networks (0.68)

Aspiration-based Perturbed Learning Automata in Games with Noisy Utility Measurements. Part A: Stochastic Stability in Non-zero-Sum Games

Chasparis, Georgios C.

arXiv.org Artificial Intelligence | Nov-26-2025

Reinforcement-based learning has attracted considerable attention both in modeling human behavior as well as in engineering, for designing measurement- or payoff-based optimization schemes. Such learning schemes exhibit several advantages, especially in relation to filtering out noisy observations. However, they may exhibit several limitations when applied in a distributed setup. In multi-player weakly-acyclic games, and when each player applies an independent copy of the learning dynamics, convergence to (usually desirable) pure Nash equilibria cannot be guaranteed. Prior work has only focused on a small class of games, namely potential and coordination games. To address this main limitation, this paper introduces a novel payoff-based learning scheme for distributed optimization, namely aspiration-based perturbed learning automata (APLA). In this class of dynamics, and contrary to standard reinforcement-based learning schemes, each player's probability distribution for selecting actions is reinforced both by repeated selection and an aspiration factor that captures the player's satisfaction level. We provide a stochastic stability analysis of APLA in multi-player positive-utility games under the presence of noisy observations. This is the first part of the paper that characterizes stochastic stability in generic non-zero-sum games by establishing equivalence of the induced infinite-dimensional Markov chain with a finite dimensional one. In the second part, stochastic stability is further specialized to weakly acyclic games.

artificial intelligence, convergence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2511.11602

Country: